Collaborating Authors

 Bulloch County


Outlier Detection of Poisson-Distributed Targets Using a Seabed Sensor Network

Kim, Mingyu, Stilwell, Daniel, Jimenez, Jorge

arXiv.org Artificial Intelligence

This paper presents a framework for classifying and detecting spatial commission outliers in maritime environments using seabed acoustic sensor networks and log Gaussian Cox processes (LGCPs). By modeling target arrivals as a mixture of normal and outlier processes, we estimate the probability that a newly observed event is an outlier. We propose a second-order approximation of this probability that incorporates both the mean and variance of the normal intensity function, providing improved classification accuracy compared to mean-only approaches. Using Jensen's inequality, we show analytically that our method yields a tighter bound on the true probability. To enhance detection, we integrate a real-time, near-optimal sensor placement strategy that dynamically adjusts sensor locations based on the evolving outlier intensity. The proposed framework is validated using real ship traffic data near Norfolk, Virginia, where numerical results demonstrate the effectiveness of our approach in improving both classification performance and outlier detection through sensor deployment.
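The second-order idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the outlier probability takes the mixture form p = lam_out / (lam_out + lam_norm), that the normal intensity lam_norm is lognormal (as it would be pointwise under an LGCP), and that the correction is a second-order Taylor expansion around the mean intensity. All function names and parameter values are hypothetical.

```python
import math


def lognormal_moments(m, s):
    """Mean and variance of exp(f) when f ~ N(m, s^2)."""
    mean = math.exp(m + 0.5 * s * s)
    var = (math.exp(s * s) - 1.0) * math.exp(2.0 * m + s * s)
    return mean, var


def outlier_prob(lam_out, lam_norm):
    """Assumed mixture form: probability that an event is an outlier."""
    return lam_out / (lam_out + lam_norm)


def mean_only(lam_out, m, s):
    """Plug the mean normal intensity into the probability."""
    mean, _ = lognormal_moments(m, s)
    return outlier_prob(lam_out, mean)


def second_order(lam_out, m, s):
    """Mean-only value plus the term 0.5 * g''(mean) * variance.

    For g(lam) = lam_out / (lam_out + lam), g''(lam) = 2 * lam_out / (lam_out + lam)^3.
    """
    mean, var = lognormal_moments(m, s)
    return outlier_prob(lam_out, mean) + lam_out * var / (lam_out + mean) ** 3


def reference_probability(lam_out, m, s, n=4001, width=8.0):
    """Trapezoid-rule estimate of E[g(exp(f))] over m +/- width * s."""
    lo = m - width * s
    h = 2.0 * width * s / (n - 1)
    total = 0.0
    for i in range(n):
        f = lo + i * h
        pdf = math.exp(-0.5 * ((f - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * outlier_prob(lam_out, math.exp(f)) * pdf
    return total * h


if __name__ == "__main__":
    lam_out, m, s = 1.0, 0.0, 0.3  # hypothetical values
    print("reference   :", reference_probability(lam_out, m, s))
    print("mean-only   :", mean_only(lam_out, m, s))
    print("second-order:", second_order(lam_out, m, s))
```

Because g is convex in the normal intensity, Jensen's inequality implies the mean-only value cannot exceed the true expectation; the variance term moves the estimate toward it.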


An Improved Transformer-based Model for Detecting Phishing, Spam, and Ham: A Large Language Model Approach

Jamal, Suhaima, Wimmer, Hayden

arXiv.org Artificial Intelligence

Phishing and spam detection is a long-standing challenge that has been the subject of much academic research. Large Language Models (LLMs) have vast potential to transform society and provide new and innovative approaches to solving well-established challenges. Phishing and spam have caused financial hardship and lost time and resources for email users all over the world, and they frequently serve as an entry point for ransomware threat actors. While detection approaches exist, especially heuristic-based approaches, LLMs offer the potential to venture into a new, unexplored area for understanding and solving this challenge. LLMs have rapidly altered the landscape across business, consumers, and academia, and they demonstrate transformational potential for society. Based on this, applying these new and innovative approaches to email detection is a rational next step in academic research. In this work, we present IPSDM, our model based on fine-tuning the BERT family of models to specifically detect phishing and spam email. We demonstrate that our fine-tuned version, IPSDM, is better able to classify emails in both unbalanced and balanced datasets. This work serves as an important first step towards employing LLMs to improve the security of our information systems.


Overview Analysis of Recent Developments on Self-Driving Electric Vehicles

Ajao, Qasim, Sadeeq, Lanre

arXiv.org Artificial Intelligence

In recent years, the development of autonomous electric vehicles (AEVs) has gained significant attention from researchers and engineers worldwide. AEVs are expected to revolutionize the way we commute and transport goods, offering safer and more efficient solutions to our transportation needs.


A Unified Bayesian Framework for Pricing Catastrophe Bond Derivatives

Domfeh, Dixon, Chatterjee, Arpita, Dixon, Matthew

arXiv.org Machine Learning

Catastrophe (CAT) bond markets are incomplete and hence carry uncertainty in instrument pricing. As such, various pricing approaches have been proposed, but none treat the uncertainty in catastrophe occurrences and interest rates in a sufficiently flexible and statistically reliable way within a unifying asset pricing framework. Consequently, little is known empirically about the expected risk premia of CAT bonds. The primary contribution of this paper is to present a unified Bayesian CAT bond pricing framework based on uncertainty quantification of catastrophes and interest rates. Our framework allows for complex beliefs about catastrophe risks to capture the distinct and common patterns in catastrophe occurrences, and when combined with stochastic interest rates, yields a unified asset pricing approach with informative expected risk premia. Specifically, using a modified collective risk model -- Dirichlet Prior-Hierarchical Bayesian Collective Risk Model (DP-HBCRM) framework -- we model catastrophe risk via a model-based clustering approach. Interest rate risk is modeled as a CIR process under the Bayesian approach. As a consequence of casting CAT pricing models into our framework, we evaluate the price and expected risk premia of various CAT bond contracts corresponding to clustering of catastrophe risk profiles. Numerical experiments show how CAT bond prices and expected risk premia relate to claim frequency and loss severity across these clusters.
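For readers unfamiliar with the CIR process mentioned in the abstract above, the following is a minimal simulation sketch of the standard Cox-Ingersoll-Ross dynamics dr = kappa * (theta - r) dt + sigma * sqrt(r) dW. It does not reproduce the paper's Bayesian estimation. The Euler scheme with full truncation, the function name, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def simulate_cir(r0, kappa, theta, sigma, dt, n_steps, seed=0):
    """Simulate one CIR short-rate path with an Euler full-truncation scheme.

    kappa: mean-reversion speed, theta: long-run level, sigma: volatility.
    Returns a list of n_steps + 1 non-negative rates.
    """
    rng = random.Random(seed)
    r = r0
    path = [max(r0, 0.0)]
    for _ in range(n_steps):
        r_pos = max(r, 0.0)  # truncate inside drift and diffusion terms
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over dt
        r = r + kappa * (theta - r_pos) * dt + sigma * math.sqrt(r_pos) * dw
        path.append(max(r, 0.0))  # report the truncated rate
    return path


if __name__ == "__main__":
    # Hypothetical parameters: 3% start, 5% long-run level, daily steps for one year.
    path = simulate_cir(r0=0.03, kappa=0.5, theta=0.05, sigma=0.1,
                        dt=1.0 / 252, n_steps=252, seed=42)
    print(len(path), min(path), path[-1])
```

In a Bayesian treatment, kappa, theta, and sigma would be drawn from a posterior rather than fixed; paths like this one would then be simulated per posterior draw.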


Attention Patterns Detection using Brain Computer Interfaces

Hamza-Lup, Felix G., Suri, Adytia, Iacob, Ionut E., Goldbach, Ioana R., Rasheed, Lateef, Borza, Paul N.

arXiv.org Artificial Intelligence

The human brain provides a range of functions, such as expressing emotions and controlling the rate of breathing, and its study has attracted the interest of scientists for many years. As machine learning models become more sophisticated and biometric data becomes more readily available through new non-invasive technologies, it becomes increasingly possible to gain access to interesting biometric data that could revolutionize Human-Computer Interaction. In this research, we propose a method to assess and quantify human attention levels and their effects on learning. In our study, we employ a brain-computer interface (BCI) capable of detecting brain wave activity and displaying the corresponding electroencephalograms (EEG). We train recurrent neural networks (RNNs) to identify the type of activity an individual is performing.


DCSVM: Fast Multi-class Classification using Support Vector Machines

Don, Duleep Rathgamage, Iacob, Ionut E.

arXiv.org Machine Learning

DCSVM is a divide-and-conquer algorithm that relies on data sparsity in high-dimensional space and performs a smart partitioning of the whole training data set into disjoint subsets that are easily separable. A single prediction performed between two partitions eliminates at once one or more classes in one partition, leaving only a reduced number of candidate classes for subsequent steps. The algorithm continues recursively, reducing the number of classes at each step, until a final binary decision is made between the last two classes left in the competition. In the best-case scenario, our algorithm makes a final decision between k classes in O(log k) decision steps, and in the worst-case scenario DCSVM makes a final decision in k - 1 steps, which is no worse than existing techniques.

1. Introduction

The curse of dimensionality refers to various phenomena that arise when analyzing and organizing data in high-dimensional spaces (often with hundreds or thousands of dimensions) that do not occur in low-dimensional settings such as the three-dimensional physical space of everyday experience. The expression was coined by Richard E. Bellman in a highly acclaimed article considering problems in dynamic optimization [1, 2]. In essence, as dimensionality increases, the volume of the space increases rapidly, and the available data become sparser and sparser.
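The step counts quoted in the DCSVM abstract can be illustrated with a toy elimination loop. This is only a sketch of the counting argument, not the DCSVM algorithm itself: the `decide` oracle stands in for a trained binary SVM, and the two `split` functions are hypothetical partitioning rules chosen to produce the best and worst cases.

```python
def eliminate(classes, decide, split):
    """Repeatedly split the candidate classes and keep the winning side.

    decide(left, right) returns True if the sample belongs to `left`.
    Returns the surviving class and the number of binary decisions made.
    """
    candidates = list(classes)
    steps = 0
    while len(candidates) > 1:
        left, right = split(candidates)
        candidates = left if decide(left, right) else right
        steps += 1
    return candidates[0], steps


def balanced_split(candidates):
    """Halve the candidates: each decision eliminates about half of them."""
    mid = len(candidates) // 2
    return candidates[:mid], candidates[mid:]


def one_vs_rest_split(candidates):
    """Peel off one class at a time: each decision may eliminate only one."""
    return candidates[:1], candidates[1:]


if __name__ == "__main__":
    k, true_class = 8, 7

    def oracle(left, right):
        # Stand-in for a binary classifier that always answers correctly.
        return true_class in left

    print(eliminate(range(k), oracle, balanced_split))
    print(eliminate(range(k), oracle, one_vs_rest_split))
```

With k = 8 and the true class last in the list, the balanced split needs 3 decisions (log2 of 8) and the one-at-a-time split needs 7 (k - 1), matching the best-case and worst-case counts stated in the abstract.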


Spatiotemporal Interpolation Methods for Air Pollution Exposure

Li, Lixin (Georgia Southern University) | Zhang, Xingyou (Centers for Disease Control and Prevention) | Holt, James B. (Centers for Disease Control and Prevention) | Tian, Jie (Georgia Southern University) | Piltner, Reinhard (Georgia Southern University)

AAAI Conferences

This paper investigates spatiotemporal interpolation methods for air pollution assessment. The air pollutant of interest in this paper is fine particulate matter PM2.5. The choice of the time scale is investigated when applying the shape function-based method. It is found that the measurement scale of the time dimension has an impact on the interpolation results. Based upon a comparison of the accuracies of the interpolation results, the most effective time scale out of four experimental ones was selected for performing the PM2.5 interpolation. The paper also evaluates population exposure to ambient PM2.5 air pollution at the county level in the contiguous U.S. in 2009. The interpolated county-level PM2.5 has been linked to 2009 population data, and the population with a risky PM2.5 exposure has been estimated. A risky PM2.5 exposure is defined as a PM2.5 concentration exceeding the National Ambient Air Quality Standards. The geographic distribution of the counties with a risky PM2.5 exposure is visualized. This work is essential to understanding the associations between ambient air pollution exposure and population health outcomes.
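The abstract's observation that the time scale affects interpolation results can be seen in a small sketch. Note that this example swaps in inverse distance weighting (IDW) over space-time rather than the shape function-based method the paper studies, purely because IDW is compact enough to show here. The function name, the `time_scale` parameter, and the sample values are hypothetical.

```python
import math


def st_idw(samples, query, time_scale=1.0, power=2.0):
    """Inverse distance weighting in (x, y, t) with time multiplied by time_scale.

    samples: iterable of (x, y, t, value); query: (x, y, t).
    """
    qx, qy, qt = query
    num = 0.0
    den = 0.0
    for x, y, t, value in samples:
        dist = math.sqrt((x - qx) ** 2 + (y - qy) ** 2
                         + (time_scale * (t - qt)) ** 2)
        if dist == 0.0:
            return value  # the query coincides with a measurement
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


if __name__ == "__main__":
    # One sample nearby in space at the same time, one at the same place a time unit later.
    samples = [(1.0, 0.0, 0.0, 10.0), (0.0, 0.0, 1.0, 20.0)]
    for scale in (1.0, 10.0):
        print(scale, st_idw(samples, (0.0, 0.0, 0.0), time_scale=scale))
```

With `time_scale=1.0` the two samples are equidistant and the estimate is their average; a larger scale pushes the later measurement farther away in space-time, so the estimate moves toward the contemporaneous sample.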